 Construction Machinery & Heavy Trucks


Towards Learning Boulder Excavation with Hydraulic Excavators

Gruetter, Jonas, Terenzi, Lorenzo, Egli, Pascal, Hutter, Marco

arXiv.org Artificial Intelligence

Construction sites frequently require removing large rocks before excavation or grading can proceed. Human operators typically extract these boulders using only standard digging buckets, avoiding time-consuming tool changes to specialized grippers. This task demands manipulating irregular objects with unknown geometries in harsh outdoor environments where dust, variable lighting, and occlusions hinder perception. The excavator must adapt to varying soil resistance--dragging along hard-packed surfaces or penetrating soft ground--while coordinating multiple hydraulic joints to secure rocks using a shovel. Current autonomous excavation focuses on continuous media (soil, gravel) or uses specialized grippers with detailed geometric planning for discrete objects. These approaches either cannot handle large irregular rocks or require impractical tool changes that interrupt workflow. We train a reinforcement learning policy in simulation using rigid-body dynamics and analytical soil models. The policy processes sparse LiDAR points (just 20 per rock) from vision-based segmentation and proprioceptive feedback to control standard excavator buckets. The learned agent discovers different strategies based on soil resistance: dragging along the surface in hard soil and penetrating directly in soft conditions. Field tests on a 12-ton excavator achieved 70% success across varied rocks (0.4-0.7m) and soil types, compared to 83% for human operators. This demonstrates that standard construction equipment can learn complex manipulation despite sparse perception and challenging outdoor conditions.


ExT: Towards Scalable Autonomous Excavation via Large-Scale Multi-Task Pretraining and Fine-Tuning

Zhai, Yifan, Terenzi, Lorenzo, Frey, Patrick, Soto, Diego Garcia, Egli, Pascal, Hutter, Marco

arXiv.org Artificial Intelligence

Scaling up the deployment of autonomous excavators is of great economic and societal importance. Yet it remains a challenging problem, as effective systems must robustly handle unseen worksite conditions and new hardware configurations. Current state-of-the-art approaches rely on highly engineered, task-specific controllers, which require extensive manual tuning for each new scenario. In contrast, recent advances in large-scale pretrained models have shown remarkable adaptability across tasks and embodiments in domains such as manipulation and navigation, but their applicability to heavy construction machinery remains largely unexplored. In this work, we introduce ExT, a unified open-source framework for large-scale demonstration collection, pretraining, and fine-tuning of multitask excavation policies. ExT policies are first trained on large-scale demonstrations collected from a mix of experts, then fine-tuned either with supervised fine-tuning (SFT) or reinforcement learning fine-tuning (RLFT) to specialize to new tasks or operating conditions. Through both simulation and real-world experiments, we show that pretrained ExT policies can execute complete excavation cycles with centimeter-level accuracy, successfully transferring from simulation to real machine with performance comparable to specialized single-task controllers. Furthermore, in simulation, we demonstrate that ExT's fine-tuning pipelines allow rapid adaptation to new tasks, out-of-distribution conditions, and machine configurations, while maintaining strong performance on previously learned tasks. These results highlight the potential of ExT to serve as a foundation for scalable and generalizable autonomous excavation.


High-Precision and High-Efficiency Trajectory Tracking for Excavators Based on Closed-Loop Dynamics

Zou, Ziqing, Wang, Cong, Hu, Yue, Liu, Xiao, Xu, Bowen, Xiong, Rong, Fan, Changjie, Chen, Yingfeng, Wang, Yue

arXiv.org Artificial Intelligence

The complex nonlinear dynamics of hydraulic excavators, such as time delays and control coupling, pose significant challenges to achieving high-precision trajectory tracking. Traditional control methods often fall short in such applications due to their inability to effectively handle these nonlinearities, while commonly used learning-based methods require extensive interactions with the environment, leading to inefficiency. To address these issues, we introduce EfficientTrack, a trajectory tracking method that integrates model-based learning to manage nonlinear dynamics and leverages closed-loop dynamics to improve learning efficiency, ultimately minimizing tracking errors. Comparative experiments in simulation demonstrate that our method outperforms existing learning-based approaches, achieving the highest tracking precision and smoothness with the fewest interactions. Real-world experiments further show that our method remains effective under load conditions and possesses the ability for continual learning, highlighting its practical applicability. Excavators are primarily used in earthworks, mining, and construction projects, playing a vital role in tasks such as digging, loading, trenching, and leveling [1], [2], [3].


An integrated process for design and control of lunar robotics using AI and simulation

Lindmark, Daniel, Andersson, Jonas, Bodin, Kenneth, Bodin, Tora, Börjesson, Hugo, Nordfeldth, Fredrik, Servin, Martin

arXiv.org Artificial Intelligence

We envision an integrated process for developing lunar construction equipment, where physical design and control are explored in parallel. In this paper, we describe a technical framework that supports this process. It relies on OpenPLX, a readable/writable declarative language that links CAD-models and autonomous systems to high-fidelity, real-time 3D simulations of contacting multibody dynamics, machine regolith interaction forces, and non-ideal sensors. To demonstrate its capabilities, we present two case studies, including an autonomous lunar rover that combines a vision-language model for navigation with a reinforcement learning-based control policy for locomotion.


Towards Edge-Based Idle State Detection in Construction Machinery Using Surveillance Cameras

Küpers, Xander, Brinke, Jeroen Klein, Bemthuis, Rob, Incel, Ozlem Durmaz

arXiv.org Artificial Intelligence

The construction industry faces significant challenges in optimizing equipment utilization, as underused machinery leads to increased operational costs and project delays. Accurate and timely monitoring of equipment activity is therefore key to identifying idle periods and improving overall efficiency. This paper presents the Edge-IMI framework for detecting idle construction machinery, specifically designed for integration with surveillance camera systems. The proposed solution consists of three components: object detection, tracking, and idle state identification, which are tailored for execution on resource-constrained, CPU-based edge computing devices. The performance of Edge-IMI is evaluated using a combined dataset derived from the ACID and MOCS benchmarks. Experimental results confirm that the object detector achieves an F1 score of 71.75%, indicating robust real-world detection capabilities. The logistic regression-based idle identification module reliably distinguishes between active and idle machinery with minimal false positives. Integrating all three modules, Edge-IMI enables efficient on-site inference, reducing reliance on high-bandwidth cloud services and costly hardware accelerators. We also evaluate the performance of object detection models on Raspberry Pi 5 and Intel NUC platforms as example edge computing devices, assessing the feasibility of real-time processing and the impact of model optimization techniques.


A simulation framework for autonomous lunar construction work

Linde, Mattias, Lindmark, Daniel, Ålstig, Sandra, Servin, Martin

arXiv.org Artificial Intelligence

We present a simulation framework for lunar construction work involving multiple autonomous machines. The framework supports modelling of construction scenarios and autonomy solutions, execution of the scenarios in simulation, and analysis of work time and energy consumption throughout the construction project. The simulations are based on physics-based models for contacting multibody dynamics and deformable terrain, including vehicle-soil interaction forces and soil flow in real time. A behaviour tree manages the operational logic and error handling, which enables the representation of complex behaviours through a discrete set of simpler tasks in a modular hierarchical structure. High-level decision-making is separated from lower-level control algorithms, with the two connected via ROS2. Excavation movements are controlled through inverse kinematics and tracking controllers. The framework is tested and demonstrated on two different lunar construction scenarios that involve an excavator and dump truck with actively controlled articulated crawlers.
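As a rough illustration of how a behaviour tree can express operational logic and error handling through simpler tasks, the following is a minimal sketch and not the framework's implementation. The node classes, the task names (`drive_to_site`, `dig`, `dump`, `recover`), and the simulated dig failure are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2


class Action:
    """Leaf node: runs a callable that returns a Status."""

    def __init__(self, fn):
        self.fn = fn

    def tick(self):
        return self.fn()


class Sequence:
    """Succeeds only if every child succeeds, in order."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Fallback:
    """Tries children in order until one succeeds (error handling)."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


log = []  # records which hypothetical tasks were executed


def task(name, status):
    """Build an Action that logs its name and returns a fixed status."""
    def run():
        log.append(name)
        return status
    return Action(run)


# Hypothetical work cycle: the dig step fails, so the fallback recovers.
work_cycle = Sequence([
    task("drive_to_site", Status.SUCCESS),
    task("dig", Status.FAILURE),
    task("dump", Status.SUCCESS),
])
tree = Fallback([work_cycle, task("recover", Status.SUCCESS)])
result = tree.tick()
```

In this sketch the sequence aborts at the failed `dig` step, `dump` is never run, and the fallback node hands control to `recover`, which is the modular error-handling pattern the abstract alludes to.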


Automatic Operation of an Articulated Dump Truck: State Estimation by Combined QZSS CLAS and Moving-Base RTK Using Multiple GNSS Receivers

Suzuki, Taro, Kojima, Shotaro, Ohno, Kazunori, Miyamoto, Naoto, Suzuki, Takahiro, Asano, Kimitaka, Komatsu, Tomohiro, Kakizaki, Hiroto

arXiv.org Artificial Intelligence

Labor shortage due to the declining birth rate has become a serious problem in the construction industry, and automation of construction work is attracting attention as a solution to this problem. This paper proposes a method to realize state estimation of dump truck position, orientation and articulation angle using multiple GNSS receivers for automatic operation of dump trucks. RTK-GNSS is commonly used for automation of construction equipment, but in mountainous areas, mobile networks are often unstable, and RTK-GNSS using GNSS reference stations cannot be used. Therefore, this paper develops a state estimation method for dump trucks that does not require a GNSS reference station by using the Centimeter Level Augmentation Service (CLAS) of the Japanese Quasi-Zenith Satellite System (QZSS). Although CLAS is capable of centimeter-level position estimation, its positioning accuracy and ambiguity fix rate are lower than those of RTK-GNSS. To solve this problem, we construct a state estimation method by factor graph optimization that combines CLAS positioning and moving-base RTK-GNSS between multiple GNSS antennas. Evaluation tests under real-world environments have shown that the proposed method can estimate the state of dump trucks with the same accuracy as conventional RTK-GNSS, but does not require a GNSS reference station.
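To make the geometry concrete, the sketch below shows how heading and articulation angle could be read off antenna-to-antenna baseline vectors of the kind moving-base RTK provides. It is not the paper's factor graph estimator. It assumes a hypothetical layout with one antenna pair aligned with the longitudinal axis of each frame (front and rear), and baselines expressed as (east, north) components in metres.

```python
import math


def heading_from_baseline(east, north):
    """Heading in radians, clockwise from north, of a baseline vector."""
    return math.atan2(east, north)


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def articulation_angle(front_baseline, rear_baseline):
    """Front-frame heading minus rear-frame heading, wrapped.

    Each argument is an (east, north) tuple for an antenna pair assumed
    to be aligned with that frame's longitudinal axis (hypothetical).
    """
    front = heading_from_baseline(*front_baseline)
    rear = heading_from_baseline(*rear_baseline)
    return wrap_angle(front - rear)


# Example: rear frame points north, front frame points north-east.
angle = articulation_angle((1.0, 1.0), (0.0, 1.0))
```

In the example the rear heading is 0 and the front heading is 45 degrees, so `angle` is pi/4 radians. A real estimator would also have to handle antenna lever arms, float versus fixed ambiguity solutions, and measurement noise.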


PersonaBOT: Bringing Customer Personas to Life with LLMs and RAG

Rizwan, Muhammed, Carlsson, Lars, Loni, Mohammad

arXiv.org Artificial Intelligence

The introduction of Large Language Models (LLMs) has significantly transformed Natural Language Processing (NLP) applications by enabling more advanced analysis of customer personas. At Volvo Construction Equipment (VCE), customer personas have traditionally been developed through qualitative methods, which are time-consuming and lack scalability. The main objective of this paper is to generate synthetic customer personas and integrate them into a Retrieval-Augmented Generation (RAG) chatbot to support decision-making in business processes. To this end, we first focus on developing a persona-based RAG chatbot integrated with verified personas. Next, synthetic personas are generated using Few-Shot and Chain-of-Thought (CoT) prompting techniques and evaluated based on completeness, relevance, and consistency using McNemar's test. In the final step, the chatbot's knowledge base is augmented with synthetic personas and additional segment information to assess improvements in response accuracy and practical utility. Key findings indicate that Few-Shot prompting outperformed CoT in generating more complete personas, while CoT demonstrated greater efficiency in terms of response time and token usage. After augmenting the knowledge base, the average accuracy rating of the chatbot increased from 5.88 to 6.42 on a 10-point scale, and 81.82% of participants found the updated system useful in business contexts.


Development of CPS Platform for Autonomous Construction

Kasahara, Yuichiro, Akinari, Kota, Kouno, Tomoya, Sano, Noriko, Abe, Taro, Yamauchi, Genki, Endo, Daisuke, Hashimoto, Takeshi, Nagatani, Keiji, Kurazume, Ryo

arXiv.org Artificial Intelligence

In recent years, labor shortages due to the declining birthrate and aging population have become significant challenges at construction sites in developed countries, including Japan. To address these challenges, we are developing an open platform called ROS2-TMS for Construction, a Cyber-Physical System (CPS) for construction sites, to achieve both efficiency and safety in earthwork operations. In ROS2-TMS for Construction, the system comprehensively collects and stores environmental information from sensors placed throughout the construction site. Based on these data, a real-time virtual construction site is created in cyberspace. Then, based on the state of construction machinery and environmental conditions in cyberspace, the optimal next actions for actual construction machinery are determined, and the construction machinery is operated accordingly. In this project, we decided to use the Open Platform for Earthwork with Robotics and Autonomy (OPERA), developed by the Public Works Research Institute (PWRI) in Japan, to control construction machinery from ROS2-TMS for Construction with an originally extended behavior tree. In this study, we present an overview of OPERA, focusing on the newly developed navigation package for operating the crawler dump, as well as the overall structure of ROS2-TMS for Construction as a Cyber-Physical System (CPS). Additionally, we conducted experiments using a crawler dump and a backhoe to verify the aforementioned functionalities.


Federated Learning framework for LoRaWAN-enabled IIoT communication: A case study

Sanchez, Oscar Torres, Borges, Guilherme, Raposo, Duarte, Rodrigues, André, Boavida, Fernando, Silva, Jorge Sá

arXiv.org Artificial Intelligence

The development of intelligent Industrial Internet of Things (IIoT) systems promises to revolutionize operational and maintenance practices, driving improvements in operational efficiency. Anomaly detection within IIoT architectures plays a crucial role in preventive maintenance and spotting irregularities in industrial components. However, due to limited message and processing capacity, traditional Machine Learning (ML) faces challenges in deploying anomaly detection models in resource-constrained environments like LoRaWAN. On the other hand, Federated Learning (FL) solves this problem by enabling distributed model training, addressing privacy concerns, and minimizing data transmission. This study explores using FL for anomaly detection in industrial and civil construction machinery architectures that use IIoT prototypes with LoRaWAN communication. The process leverages an optimized autoencoder neural network structure and compares federated models with centralized ones. Despite uneven data distribution among machine clients, FL demonstrates effectiveness, with a mean F1 score of 94.77, accuracy of 92.30, TNR of 90.65, and TPR of 92.93, comparable to centralized models, with training messages requiring 52.8 min of airtime. Local model evaluations on each machine highlight adaptability. At the same time, the performed analysis identifies message requirements, minimum training hours, and optimal round/epoch configurations for FL in LoRaWAN, guiding future implementations in constrained industrial environments.
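For readers unfamiliar with how federated models are combined when clients hold uneven amounts of data, here is a minimal sketch of federated averaging (FedAvg), a common aggregation rule. The abstract does not say which rule the study uses, so this is an assumption. The function name, the flat parameter lists, and the sample counts are hypothetical.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample count.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples for each client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical machine clients with uneven data (1 vs 3 samples).
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Here the second client contributes three times as much data, so the result, `[2.5, 3.5]`, sits closer to its parameters. Only the parameter vectors cross the network, which is what keeps message volume low on a constrained link such as LoRaWAN.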